usersDF.write.format("orc")
.option("orc.bloom.filter.columns", "favorite_color")
.option("orc.dictionary.key.threshold", "1.0")
.option("orc.column.encoding.direct", "name")
.save("users_with_options.orc")
Find full example code at "examples/src/main/scala/org/apache/spark/examples/sql/SQLDataSourceExample.scala" in the Spark repo
A standard technique from the hashing literature is to use two hash functions h1(x) and h2(x) to simulate additional hash functions of the form g_i(x) = h1(x) + i * h2(x). We demonstrate that this technique can be usefully applied to Bloom filters and related data structures. Specifically, only two hash functions are necessary to implement a Bloom filter effectively, with no loss in the asymptotic false positive probability. This leads to less computation and potentially less need for randomness in practice.
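The double-hashing scheme above can be sketched as a minimal, self-contained Bloom filter in Scala. This is a hypothetical illustration of the g_i(x) = h1(x) + i * h2(x) construction, not Spark's or ORC's actual implementation; it assumes MurmurHash3 evaluated with two different seeds supplies the base hashes h1 and h2.

```scala
import scala.util.hashing.MurmurHash3

// Illustrative Bloom filter with m bits and k probe positions, where all k
// positions are derived from just two base hashes via g_i = h1 + i * h2 mod m.
class DoubleHashBloomFilter(m: Int, k: Int) {
  private val bits = new Array[Boolean](m)

  // Two base hashes of the same key, obtained by seeding MurmurHash3 differently.
  private def baseHashes(x: String): (Int, Int) =
    (MurmurHash3.stringHash(x, 0), MurmurHash3.stringHash(x, 1))

  // The i-th simulated hash function; floorMod keeps the index non-negative.
  private def index(h1: Int, h2: Int, i: Int): Int =
    Math.floorMod(h1 + i * h2, m)

  def add(x: String): Unit = {
    val (h1, h2) = baseHashes(x)
    for (i <- 0 until k) bits(index(h1, h2, i)) = true
  }

  // May return false positives, but never false negatives for added keys.
  def mightContain(x: String): Boolean = {
    val (h1, h2) = baseHashes(x)
    (0 until k).forall(i => bits(index(h1, h2, i)))
  }
}
```

Because each probe position is a linear combination of h1 and h2, insertion and lookup compute only two hashes per key regardless of k, which is where the savings in computation come from.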